ABSTRACT

This report summarizes the results of classifications and
experiments performed by LARS/Purdue University for the Crop
Identification Technology Assessment for Remote Sensing (CITARS)
project. Background information describing the experimental
design and procedures may be found in reference [4] or [11].

Fifteen data sets were classified using two analysis
procedures. One procedure used class weights while the other
assumed equal probabilities of occurrence for all classes. In
addition, 20 data sets were classified using training statistics
from another segment or date. The results of both the local
and non-local classifications in terms of classification accuracy
and proportion estimation are presented in Part 1.

Part 2 of the report describes several additional experi- 
ments performed to provide additional understanding of the CITARS 
results. These experiments investigated alternative analysis 
procedures, training set selection and size, effects of
multitemporal registration, the spectral discriminability of corn,
soybeans, and "other," and analysis of aircraft multispectral
data.

Part 3 of the report summarizes the results and presents 
our overall conclusions. 
Part 1. CITARS Analyses

I. Introduction 

This section briefly describes the two analysis procedures 
followed by LARS in classifying the ERTS data for CITARS and 
presents the results of the classifications as measured by 
classification accuracy and proportion estimation. 

II. Data Analysis Procedures 

The CITARS data analysis procedures used by LARS were
designed to be automated (capable of being programmed) and
repeatable, with the intent of minimizing the amount of subjective
decision making on the part of the analysts. Subsequent tests 
have shown that different analysts following the procedures ob- 
tained identical results. This has the advantage of allowing 
comparison of results obtained by different analysts which is 
an important consideration in evaluating different data collec- 
tion or data processing technologies as in CITARS. It also has 
the potential for increasing the speed and volume of data anal- 
ysis relative to procedures involving the analyst to a greater 
degree. On the other hand, some performance may be sacrificed 
when the analyst is not permitted to tailor the analysis pro- 
cedure to the particular problem and data set. 

The analysis techniques used by LARS utilized the LARSYS
Version 3 multispectral data analysis system. Its theoretical
basis and details of the algorithm implementation are described
by Swain [1] and Phillips [2]. The analysis procedure was
described in detail by Davis and Swain [3] and in Volume I of
the CITARS final report [4]. The procedures are designed to
provide repeatable results, i.e., variation due to analysts is
minimized. Briefly, the analysis procedures consist of:

A. Class Definition and Refinement 

Four major classes, corn, soybeans, wheat (for selected
missions), and all "other" ground covers, were defined. These
major classes were divided into subclasses where spectral
variability within a class was so great as to result in multimodal
probability distributions for that class. Clustering of
quarter-section field centers was used to isolate the subclasses.
All four ERTS bands were used for clustering. A systematic method
which minimized the total number of subclasses while avoiding
multimodal subclass distributions was used for interpreting
information on the separability of subclasses [3].

B. Classification 

Each data set was analyzed using two versions of the maxi- 
mum likelihood classification algorithm. Gaussian probability 
density functions were assumed for both procedures. The first 
classification method, LARS/SP1, was the maximum likelihood
classification rule assuming equal prior probabilities for all 
classes and subclasses. This is the rule which has been in 
common usage for remote sensing data analysis for some time. 

The second method, LARS/SP2, used "class weights"
proportional to the class prior probabilities. This approach is
more nearly optimal given that the Bayesian error criterion 
(minimum expected error) is preferred. Class weights may be 


The results of the classification were displayed using a 
discriminant threshold of This low threshold eliminated 

only those data points very much different from the major class
characterizations. Thresholded points were counted in the "other"
category. A computer program was used to generate results
tabulations, in both printed and punchcard form, for training fields,
test fields, and test sections.

III. Classification Results 

The classification results obtained by LARS are summarized 
in Tables 1-8. Classification accuracy (average and overall) 
and class bias and root mean square errors of proportion esti- 
mates are presented. Tables 1-4 present the results of the 
local recognition and Tables 5-8 show the non-local
classification results. The statistical analyses of the classification
results, along with those of EOD and ERIM, are presented and
discussed in Volumes IX and X of the CITARS final report and will
not be repeated here, except for the comparison of the two
analysis procedures used by LARS.
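The bias and root mean square (RMS) error columns of the proportion estimate tables can be illustrated with a short Python sketch. It assumes that the overall RMS error for a segment is the square root of the mean squared class bias, an assumption that reproduces the tabulated value for HU(6) in Table 1. The function names are illustrative and are not part of LARSYS.

```python
import math

def class_bias(estimated, photointerpreted):
    """Bias per class: estimated minus photointerpreted proportion."""
    return [e - p for e, p in zip(estimated, photointerpreted)]

def rms_error(biases):
    """Root mean square of the class biases (assumed definition)."""
    return math.sqrt(sum(b * b for b in biases) / len(biases))

# Class biases for HU(6) from Table 1: corn, soybean, "other".
hu6_bias = [0.157, 0.302, -0.459]
print(round(rms_error(hu6_bias), 3))
```

A bias near zero for every class gives an RMS error near zero, whereas large biases of opposite sign, as for HU(6), give a large RMS error even though the biases sum to zero.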

The LARS/SP1 procedure used a maximum likelihood Gaussian
classifier which assumed that the frequency of occurrence was
the same for all classes. The LARS/SP2 procedure
was identical to the SP1 procedure except that unequal class weights
(i.e., prior probability information) were used. The use of the
"correct" values for the frequency of occurrence of each class
will theoretically maximize the overall performance; that is,
the proportion of the test pixels which are correctly classified.
LARS/SP2 was designed to attempt to maximize overall performance.
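The difference between the equal-prior and class-weight rules can be sketched as follows. This is a minimal illustration, not the LARSYS implementation: for brevity it treats the bands as independent (a diagonal covariance), which is a simplifying assumption, and the training statistics, pixel values, and weights are hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of independent Gaussians, one per band."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - (xi - m) ** 2 / (2.0 * v)
        for xi, m, v in zip(x, mean, var)
    )

def classify(x, stats, weights=None):
    """Return the class with the largest (weighted) likelihood.

    stats   -- {class_name: (band_means, band_variances)}
    weights -- {class_name: prior probability}; None means equal priors
    """
    best_name, best_score = None, -math.inf
    for name, (mean, var) in stats.items():
        score = log_gaussian(x, mean, var)
        if weights is not None:
            score += math.log(weights[name])  # add log prior
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical two-band training statistics, for illustration only.
stats = {
    "corn": ([30.0, 60.0], [16.0, 25.0]),
    "other": ([34.0, 54.0], [16.0, 25.0]),
}
pixel = [31.8, 57.0]
print(classify(pixel, stats))                                # equal priors
print(classify(pixel, stats, {"corn": 0.2, "other": 0.8}))   # class weights
```

In this example the pixel lies slightly closer to the corn mean, so the equal-prior rule assigns it to corn, while a large weight on "other" moves the decision to "other". Pixels far from the class boundary are unaffected by the weights.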

Statistical comparison of the results of the equal
(SP1) and unequal (SP2) prior probability procedures indicated
that the use of historical data as a basis for prior
probabilities did not significantly affect proportion estimation or
classification accuracy for either local or non-local recognition
on the basis of average performance. In interpreting
this result it must be remembered that LARS/SP2 was an attempt
to maximize overall performance rather than average performance.
In CITARS, however, the two procedures were not
significantly different as measured by either overall or average
classification accuracy. Therefore, the quality of the prior
probabilities used should be examined.
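The distinction between average and overall performance, as defined in the notes to the classification accuracy tables, can be made concrete with a small sketch. The pixel counts below are hypothetical and the function names are illustrative.

```python
def class_accuracies(confusion):
    """Per-class accuracy: correct pixels divided by pixels in the class.

    confusion[i][j] = number of pixels of true class i assigned to class j.
    """
    return [row[i] / sum(row) for i, row in enumerate(confusion)]

def average_accuracy(confusion):
    """Unweighted mean of the per-class accuracies."""
    acc = class_accuracies(confusion)
    return sum(acc) / len(acc)

def overall_accuracy(confusion):
    """Correctly classified pixels divided by all pixels classified."""
    correct = sum(row[i] for i, row in enumerate(confusion))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical counts; rows are true corn, soybean, and "other".
confusion = [
    [80, 10, 10],
    [20, 70, 10],
    [120, 80, 200],
]
print(round(average_accuracy(confusion), 3))
print(round(overall_accuracy(confusion), 3))
```

Because "other" contains most of the pixels in this example, its low accuracy pulls the overall figure below the average figure. The two measures coincide only when all classes contain the same number of pixels or have the same accuracy.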

The unequal prior probabilities were based on the 1972 crop
acreage estimates made by the USDA Statistical Reporting Service
for each county. While it was expected that the probabilities
derived from these figures would not be the true probabilities
for 1973, it was expected that there would be no major change.

The USDA figures were available only on a county basis,
while CITARS examined only a 5 x 20 mile segment of each county.
Furthermore, performance was examined on only 20 of the 100
sections in the segment. Since the crop proportions varied
significantly from section to section, the crop proportions based
on county estimates may not apply. Table 9 presents the actual
proportions in the 20 sections of each segment and the class
weights used in LARS/SP2. Examination of the data in Table 9
shows that there was considerable difference between the two.

A final observation is that the classifier may not be very
sensitive to the differences between equal and non-equal weights
which were actually present in the CITARS data.
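The size of the difference between the class weights and the photointerpreted proportions can be worked out directly from Table 9. The sketch below transcribes the Table 9 values; the Fayette soybean weight is partly illegible in the source and is taken here as 23.76 on the assumption that each row of weights sums to 100, as the other five rows do.

```python
# (weights used in LARS/SP2, photointerpreted proportions), each given as
# (corn, soybean, "other"), transcribed from Table 9.
SEGMENTS = {
    "Huntington": ((23.72, 23.92, 52.36), (18.59, 22.07, 59.34)),
    "Shelby":     ((34.69, 22.16, 43.15), (38.29, 24.30, 37.41)),
    "White":      ((31.45, 26.70, 41.85), (36.28, 31.08, 32.64)),
    "Livingston": ((38.59, 37.75, 23.66), (32.46, 37.75, 29.79)),
    "Fayette":    ((14.15, 23.76, 62.09), (19.43, 29.34, 51.22)),  # 23.76 assumed
    "Lee":        ((37.91, 21.92, 40.17), (33.22, 28.70, 38.07)),
}

def weight_minus_observed(segment):
    """Class weight minus photointerpreted proportion, per class."""
    weights, observed = SEGMENTS[segment]
    return tuple(round(w - o, 2) for w, o in zip(weights, observed))

for name in SEGMENTS:
    print(f"{name:<11}", weight_minus_observed(name))
```

For Huntington, for example, the weights exceed the photointerpreted proportions by about 5 points for corn and fall short by about 7 points for "other".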

Our conclusion is that while prior probability information
in the form of class weights should be used when available (as
such usage has a sound theoretical basis), it may not in
practice give much, if any, improvement in performance. Further
tests to determine the sensitivity of the classifier to class
weights are recommended.
TABLE 1. BIAS AND ROOT MEAN SQUARE ERROR OF PROPORTION ESTIMATES
USING LARS/SP1 FOR LOCAL RECOGNITION.

                     CLASS BIAS             ROOT MEAN SQUARE ERROR

SEGMENT                                        OVERALL    AVERAGE
(PASS)           CORN    SOYBEAN    'OTHER'    SEGMENT       OVER
                                             ESTIMATES   SECTIONS

HU(6)           0.157      0.302      -.459      0.330      0.292
HU(13)          0.061      0.121      -.182      0.131      0.157
SH(12)          0.014      -.038      0.024      0.027      0.129
SH(13)          0.206      -.057      -.149      0.151      0.207
WH(10)          -.058      0.091      -.033      0.065      0.109
WH(11)          -.046      0.080      -.034      0.057      0.150
LI(5)           0.004      -.005      0.001      0.004      0.112
LI(7)           -.013      0.017      -.004      0.013      0.097
FA(4)           0.127      -.152      0.025      0.115      0.180
FA(5)           0.185      -.020      -.165      0.144      0.192
FA(6)           0.179      0.017      -.196      0.154      0.178
FA(9)           0.076      0.145      -.220      0.158      0.136
LE(5)           0.014      0.015      -.029      0.020      0.111
LE(6)           0.011      -.034      0.023      0.025      0.110
LE(8)           0.029      0.018      -.047      0.034      0.118

MEANS OVER
SEGMENTS        0.063      0.033      -.096      0.095      0.152

BIAS = ESTIMATED - PHOTOINTERPRETED PROPORTION

TABLE 2. BIAS AND ROOT MEAN SQUARE ERROR OF PROPORTION ESTIMATES
USING LARS/SP2 FOR LOCAL RECOGNITION.

                     CLASS BIAS             ROOT MEAN SQUARE ERROR

SEGMENT                                        OVERALL    AVERAGE
(PASS)           CORN    SOYBEAN    'OTHER'    SEGMENT       OVER
                                             ESTIMATES   SECTIONS

HU(6)           0.227      0.229      -.456      0.322      0.281
HU(13)          0.177      0.006      -.183      0.147      0.182
SH(12)          0.125      -.069      -.056      0.089      0.163
SH(13)          0.044      0.051      -.095      0.067      0.148
WH(10)          -.041      -.002      0.042      0.034      0.094
WH(11)          -.062      -.072      0.134      0.095      0.146
LI(5)           0.014      0.016      -.031      0.022      0.131
LI(7)           0.097      -.098      0.001      0.079      0.150
FA(4)           0.078      0.014      -.091      0.070      0.139
FA(5)           0.086      0.140      -.226      0.162      0.175
FA(6)           0.180      -.007      -.173      0.144      0.172
FA(9)           0.092      0.140      -.232      0.165      0.141
LE(5)           0.075      0.219      -.294      0.216      0.203
LE(6)           0.069      0.117      -.187      0.133      0.142
LE(8)           0.007      0.125      -.132      0.105      0.147

MEANS OVER
SEGMENTS        0.078      0.054      -.132      0.123      0.161

BIAS = ESTIMATED - PHOTOINTERPRETED PROPORTION

TABLE 3. CLASSIFICATION ACCURACY USING LARS/SP1 FOR LOCAL
RECOGNITION.

                        CLASSIFICATION ACCURACY

SEGMENT
(PASS)           CORN    SOYBEAN    'OTHER'    AVERAGE    OVERALL

HU(6)           0.599      0.910      0.313      0.607      0.448
HU(13)          0.478      0.471      0.505      0.484      0.496
SH(12)          0.498      0.482      0.527      0.502      0.498
SH(13)          0.640      0.266      0.245      0.384      0.485
WH(10)          0.748      0.841      0.639      0.742      0.751
WH(11)          0.545      0.810      0.471      0.609      0.612
LI(5)           0.618      0.632      0.512      0.588      0.599
LI(7)           0.691      0.633      0.777      0.700      0.673
FA(4)           0.745      0.235      0.651      0.544      0.531
FA(5)           0.864      0.425      0.325      0.538      0.511
FA(6)           0.968      0.458      0.433      0.620      0.592
FA(9)           0.790      0.950      0.652      0.797      0.796
LE(5)           0.570      0.634      0.413      0.539      0.576
LE(6)           0.641      0.573      0.462      0.559      0.583
LE(8)           0.568      0.536      0.549      0.551      0.550

MEANS OVER
SEGMENTS        0.664      0.590      0.498      0.584      0.580

ACCURACY = PROPORTION OF CORRECTLY CLASSIFIED PIXELS
           IN A CLASS

AVERAGE = AVERAGE CLASS ACCURACY

OVERALL = PROPORTION OF CORRECTLY CLASSIFIED PIXELS
          OF ALL PIXELS CLASSIFIED

TABLE 4. CLASSIFICATION ACCURACY USING LARS/SP2 FOR LOCAL
RECOGNITION.

                        CLASSIFICATION ACCURACY

SEGMENT
(PASS)           CORN    SOYBEAN    'OTHER'    AVERAGE    OVERALL

HU(6)           0.681      0.889      0.317      0.629      0.458
HU(13)          0.669      0.249      0.513      0.477      0.491
SH(12)          0.623      0.441      0.463      0.509      0.551
SH(13)          0.528      0.367      0.340      0.412      0.459
WH(10)          0.721      0.808      0.773      0.767      0.764
WH(11)          0.489      0.659      0.618      0.589      0.579
LI(5)           0.582      0.674      0.510      0.589      0.607
LI(7)           0.803      0.552      0.763      0.706      0.663
FA(4)           0.513      0.444      0.549      0.502      0.502
FA(5)           0.850      0.567      0.292      0.570      0.546
FA(6)           0.958      0.489      0.535      0.660      0.638
FA(9)           0.762      0.944      0.615      0.774      0.772
LE(5)           0.686      0.825      0.141      0.551      0.669
LE(6)           0.633      0.716      0.255      0.535      0.615
LE(8)           0.555      0.641      0.435      0.543      0.579

MEANS OVER
SEGMENTS        0.670      0.618      0.475      0.588      0.593

ACCURACY = PROPORTION OF CORRECTLY CLASSIFIED PIXELS
           IN A CLASS

AVERAGE = AVERAGE CLASS ACCURACY

OVERALL = PROPORTION OF CORRECTLY CLASSIFIED PIXELS
          OF ALL PIXELS CLASSIFIED

TABLE 5. BIAS AND ROOT MEAN SQUARE ERROR OF PROPORTION ESTIMATES
USING LARS/SP1 FOR NONLOCAL RECOGNITION.

                           CLASS BIAS             ROOT MEAN SQUARE ERROR

TRAINING-                                            OVERALL    AVERAGE
CLASSIFIED             CORN    SOYBEAN    'OTHER'    SEGMENT       OVER
                                                   ESTIMATES   SECTIONS

FA(5)-FA(6)           0.129      -.031      -.098      0.095      0.159
FA(6)-FA(5)           0.189      0.051      -.240      0.179      0.186
LE(5)-LE(6)           -.007      0.094      -.087      0.074      0.128
LE(6)-LE(5)           -.113      0.002      0.111      0.092      0.149
HU(6)-LI(5)           0.185      0.030      -.215      0.164      0.268
HU(6)-LE(6)           -.117      0.298      -.182      0.213      0.260
LE(6)-LI(5)           -.267      -.070      0.337      0.252      0.268
LE(6)-HU(6)           -.126      0.108      0.018      0.097      0.204
LI(7)-LE(8)           0.093      0.167      -.259      0.186      0.181
LE(8)-LI(7)           -.037      0.005      0.032      0.029      0.151
LI(5)-FA(5)           -.075      -.240      0.315      0.233      0.273
FA(5)-LI(5)           -.225      0.053      0.173      0.167      0.257
WH(11)-SH(12)         0.017      -.105      0.088      0.080      0.143
SH(12)-WH(11)         -.036      -.035      0.071      0.050      0.122
SH(13)-HU(13)         0.306      -.038      -.269      0.236      0.264
HU(13)-SH(13)         0.068      0.103      -.171      0.121      0.146
FA(6)-HU(6)           0.119      0.140      -.259      0.183      0.254
HU(6)-FA(6)           0.174      0.241      -.415      0.294      0.261
WH(10)-FA(9)          -.142      -.116      0.257      0.182      0.236
FA(9)-WH(10)          -.221      -.073      0.294      0.216      0.195

MEANS OVER
RECOGNITIONS          -.004      0.029      -.025      0.157      0.205

BIAS = ESTIMATED - PHOTOINTERPRETED PROPORTION


TABLE 6. BIAS AND ROOT MEAN SQUARE ERROR OF PROPORTION ESTIMATES
USING LARS/SP2 FOR NONLOCAL RECOGNITION.

                           CLASS BIAS             ROOT MEAN SQUARE ERROR

TRAINING-                                            OVERALL    AVERAGE
CLASSIFIED             CORN    SOYBEAN    'OTHER'    SEGMENT       OVER
                                                   ESTIMATES   SECTIONS

FA(5)-FA(6)           0.066      0.084      -.149      0.106      0.136
FA(6)-FA(5)           0.177      0.055      -.233      0.172      0.177
LE(5)-LE(6)           -.043      0.318      -.275      0.244      0.254
LE(6)-LE(5)           -.092      0.114      -.021      0.086      0.168
HU(6)-LI(5)           0.288      -.074      -.213      0.211      0.309
HU(6)-LE(6)           0.037      0.129      -.166      0.123      0.155
LE(6)-LI(5)           -.277      0.032      0.245      0.214      0.292
LE(6)-HU(6)           -.141      0.161      -.020      0.124      0.228
LI(7)-LE(8)           0.295      -.091      -.205      0.214      0.243
LE(8)-LI(7)           -.159      0.232      -.073      0.168      0.239
LI(5)-FA(5)           -.112      -.265      0.377      0.274      0.282
FA(5)-LI(5)           -.135      0.141      -.006      0.113      0.245
WH(11)-SH(12)         -.025      -.200      0.224      0.174      0.189
SH(12)-WH(11)         0.014      -.042      0.028      0.031      0.117
SH(13)-HU(13)         0.071      0.122      -.193      0.138      0.185
HU(13)-SH(13)         0.278      -.095      -.183      0.200      0.234
FA(6)-HU(6)           0.217      0.076      -.293      0.215      0.267
HU(6)-FA(6)           0.197      0.209      -.405      0.287      0.253
WH(10)-FA(9)          -.141      -.205      0.346      0.246      0.256
FA(9)-WH(10)          -.190      -.097      0.287      0.207      0.188

MEANS OVER
RECOGNITIONS          0.016      0.030      -.046      0.177      0.221

BIAS = ESTIMATED - PHOTOINTERPRETED PROPORTION


TABLE 7. CLASSIFICATION ACCURACY USING LARS/SP1 FOR NONLOCAL
RECOGNITION.

                              CLASSIFICATION ACCURACY

TRAINING-
CLASSIFIED             CORN    SOYBEAN    'OTHER'    AVERAGE    OVERALL

FA(5)-FA(6)           0.885      0.430      0.487      0.600      0.579
FA(6)-FA(5)           0.934      0.545      0.418      0.632      0.609
LE(5)-LE(6)           0.634      0.664      0.212      0.503      0.584
LE(6)-LE(5)           0.166      0.620      0.456      0.414      0.421
HU(6)-LI(5)           0.777      0.413      0.082      0.424      0.433
HU(6)-LE(6)           0.513      0.774      0.103      0.463      0.573
LE(6)-LI(5)           0.020      0.389      0.583      0.331      0.333
LE(6)-HU(6)           0.172      0.302      0.576      0.350      0.478
LI(7)-LE(8)           0.687      0.643      0.168      0.499      0.589
LE(8)-LI(7)           0.644      0.509      0.856      0.670      0.604
LI(5)-FA(5)           0.024      0.031      0.639      0.231      0.248
FA(5)-LI(5)           0.147      0.429      0.244      0.273      0.302
WH(11)-SH(12)         0.594      0.377      0.635      0.535      0.557
SH(12)-WH(11)         0.329      0.663      0.482      0.491      0.478
SH(13)-HU(13)         0.541      0.349      0.428      0.440      0.431
HU(13)-SH(13)         0.635      0.359      0.365      0.453      0.526
FA(6)-HU(6)           0.771      0.275      0.349      0.465      0.394
HU(6)-FA(6)           0.874      0.737      0.192      0.601      0.576
WH(10)-FA(9)          0.024      0.134      0.687      0.282      0.306
FA(9)-WH(10)          0.089      0.608      0.529      0.409      0.377

MEANS OVER
RECOGNITIONS          0.473      0.463      0.425      0.453      0.470

ACCURACY = PROPORTION OF CORRECTLY CLASSIFIED PIXELS
           IN A CLASS

AVERAGE = AVERAGE CLASS ACCURACY

OVERALL = PROPORTION OF CORRECTLY CLASSIFIED PIXELS
          OF ALL PIXELS CLASSIFIED





TABLE 8. CLASSIFICATION ACCURACY USING LARS/SP2 FOR NONLOCAL
RECOGNITION.

                              CLASSIFICATION ACCURACY

TRAINING-
CLASSIFIED             CORN    SOYBEAN    'OTHER'    AVERAGE    OVERALL

FA(5)-FA(6)           0.892      0.626      0.452      0.656      0.637
FA(6)-FA(5)           0.920      0.603      0.494      0.672      0.653
LE(5)-LE(6)           0.657      0.855      0.065      0.526      0.660
LE(6)-LE(5)           0.181      0.751      0.293      0.408      0.464
HU(6)-LI(5)           0.835      0.303      0.082      0.407      0.399
HU(6)-LE(6)           0.598      0.651      0.109      0.453      0.549
LE(6)-LI(5)           0.018      0.449          *      0.257      0.291
LE(6)-HU(6)           0.166      0.376      0.533      0.358      0.458
LI(7)-LE(8)           0.870      0.419      0.304      0.531      0.575
LE(8)-LI(7)           0.440      0.745      0.823      0.669      0.659
LI(5)-FA(5)           0.014      0.014      0.803      0.277      0.300
FA(5)-LI(5)           0.311      0.536      0.128      0.325      0.370
WH(11)-SH(12)         0.525      0.154      0.719      0.466      0.483
SH(12)-WH(11)         0.391      0.687      0.417      0.498      0.494
SH(13)-HU(13)         0.280      0.630      0.545      0.485      0.523
HU(13)-SH(13)         0.824      0.114      0.335      0.424      0.580
FA(6)-HU(6)           0.802      0.386      0.369      0.519      0.430
HU(6)-FA(6)           0.888      0.732      0.233      0.617      0.592
WH(10)-FA(9)          0.031      0.081      0.799      0.304      0.331
FA(9)-WH(10)          0.105      0.585      0.514      0.401      0.372

MEANS OVER
RECOGNITIONS          0.487      0.485      0.416      0.463      0.491

* VALUE NOT LEGIBLE IN THE SOURCE

ACCURACY = PROPORTION OF CORRECTLY CLASSIFIED PIXELS
           IN A CLASS

AVERAGE = AVERAGE CLASS ACCURACY

OVERALL = PROPORTION OF CORRECTLY CLASSIFIED PIXELS
          OF ALL PIXELS CLASSIFIED



TABLE 9. WEIGHTS USED IN LARS/SP2 AND
PHOTOINTERPRETED PROPORTIONS

WEIGHTS USED IN LARS/SP2

SEGMENT            CORN    SOYBEAN    'OTHER'

HUNTINGTON        23.72      23.92      52.36
SHELBY            34.69      22.16      43.15
WHITE             31.45      26.70      41.85
LIVINGSTON        38.59      37.75      23.66
FAYETTE           14.15      23.76      62.09
LEE               37.91      21.92      40.17

PHOTOINTERPRETED PROPORTIONS

SEGMENT            CORN    SOYBEAN    'OTHER'

HUNTINGTON        18.59      22.07      59.34
SHELBY            38.29      24.30      37.41
WHITE             36.28      31.08      32.64
LIVINGSTON        32.46      37.75      29.79
FAYETTE           19.43      29.34      51.22
LEE               33.22      28.70      38.07

Part 2. Additional Investigations

I. Introduction 

Classification performances of 55 to 75 percent for test
fields were obtained for CITARS, whereas in previous ERTS
investigations 75 to 95 percent correct crop identifications
were reported [5, 6, 7, 8]. Several additional special
experiments were performed by LARS to determine the cause of the
unexpectedly low classification performance and to determine
possible methods for improving the performance. Those experiments
and results are discussed in this section.

II. Factors Affecting Classification Performance 

Before describing the various experiments that were
conducted, it may be useful to summarize possible factors affecting
classification performance. They include: (1) the method of
evaluation used, (2) the data analysis and classification
procedures used, (3) availability of training data, (4) registration
accuracy, (5) spectral characteristics of the scene, and
(6) characteristics of the ERTS data.

A. Evaluation Method 

While actual ground observations of crop identification
were available for the fields used for training the classifiers,
crop identifications for the test fields used to evaluate the
classifications were determined by photointerpretation. Accurate
identifications are, of course, required if a reliable measure
of classification performance is to be obtained. Tests of the
photointerpretation accuracy were conducted and results
indicated that the crops in 95-98 percent of the fields were
correctly identified [4]. Even this small percentage of errors
likely led to some reduction in the estimate of classification
performance, perhaps on the order of two to three percent.
However, no further work has been done by LARS to determine either
the magnitude of photointerpretation errors or their effect on
classification performance.

B. Data Analysis and Classification Procedures

A second factor which may have influenced classification
performance was the data analysis procedures used to develop
training statistics. While CITARS was intended to evaluate the
adequacy of currently available technology, the requirement for
repeatable procedures capable of being programmed in fact
resulted in the use of new and unproven analysis techniques [3].
Although these procedures were well thought out and based on
several years' experience in analyzing multispectral scanner
data, they were first used on the CITARS data. The primary
question concerning the procedures used by LARS was
whether using automatic and repeatable procedures which reduced
the number of decisions made by the analyst may have adversely
affected classification performance. To answer this question
several alternative analysis procedures were evaluated with the
CITARS data.

C. Availability of Training Data

The supervised classification methods used for CITARS
require that fields with known crop identities be available for
training. In the case of CITARS, fields from 20 quarter sections
were potentially available for training purposes. This
represented 20 percent of the total area for which the ground
cover type was identified, but the amount of training data
available is generally more critical than the percentage since a
minimum number of points is required to adequately represent a
class. As a rule of thumb the minimum is 10 times the number
of features (channels) to be used in the classification, or 40
for the CITARS data. While the original calculations of the
number of points that would be available for training indicated
that there would be adequate numbers of points, the number
actually available was considerably smaller than anticipated.

The acres, number of fields, and average field size for
the 20 quarter sections are shown in Table 10. It can be seen
that with average field sizes of only 15 to 35 acres the
maximum number of pure pixels from an individual field will
generally be small. This problem was compounded by: (1) the
criteria for sampling pixels from field centers (at least one
whole pixel between the field boundary and any sampled pixel),
(2) clouds and cloud shadows, (3) bad data lines, and
(4) segments only partially in the ERTS data. As a result of these
conditions many training sets contained fewer data points than
would have been desirable, and in some instances classes had
to be deleted because too few points were available to represent
them. Therefore, an experiment to determine the effects of
training set size and variability was performed.
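A rough calculation shows why so few pure pixels are available per field. The sketch below is a best-case estimate under stated assumptions that are not taken from this report: a square field aligned with the scan grid, and a nominal ERTS pixel spacing of 57 m across track by 79 m along track. The function name and constants are illustrative.

```python
import math

ACRE_M2 = 4046.86        # square metres per acre
PIXEL_ACROSS_M = 57.0    # assumed sample spacing across track
PIXEL_ALONG_M = 79.0     # assumed line spacing along track

def interior_pixels(field_acres, guard=1):
    """Whole pixels left in a square field after removing `guard`
    pixels along every edge (best case: field aligned with the grid)."""
    side_m = math.sqrt(field_acres * ACRE_M2)
    cols = int(side_m // PIXEL_ACROSS_M) - 2 * guard
    rows = int(side_m // PIXEL_ALONG_M) - 2 * guard
    return max(cols, 0) * max(rows, 0)

for acres in (15, 25, 35):
    print(acres, interior_pixels(acres))
```

Under these assumptions a 25-acre field yields only about six field-center pixels, so several fields are needed to reach the rule-of-thumb minimum of 40 points for a single class or subclass.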



TABLE 10. ACRES, NUMBER OF FIELDS, AND AVERAGE FIELD SIZE FOR
THE 20 QUARTER SECTIONS. (COLUMN HEADINGS ARE NOT LEGIBLE IN THE
SOURCE.)

Huntington   Acres                     831       618        63       986
             Hectares                  336       250        25       399
             No. Fields                 39        25         6        54
             Avg. Size (Acres)        21.2      24.7      10.4      18.3
             Avg. Size (Hectares)      8.6      10.0       4.2       7.4

Shelby       Acres                    1888       540       323       753
             Hectares                  764       218       131       305
             No. Fields                 71        24        15        61
             Avg. Size (Acres)        26.5      22.5      21.5      12.3
             Avg. Size (Hectares)     10.8       9.1       8.7       5.0

White        Acres                    1836       510        38       954
             Hectares                  743       206        15       386
             No. Fields                 42        13         2        41
             Avg. Size (Acres)        43.7      39.2      19.0      23.3
             Avg. Size (Hectares)     17.7      15.9       7.6       9.4

Livingston   Acres                    1239      1073        39       569
             Hectares                  501       434        16       230
             No. Fields                 33        27         2        33
             Avg. Size (Acres)        37.5      39.7      19.5      17.2
             Avg. Size (Hectares)     15.2      16.1       7.9       7.0

Fayette      Acres                     733       287       416      1358
             Hectares                  297       116       168       550
             No. Fields                 37        11        26        92
             Avg. Size (Acres)        19.8      26.0      16.0      14.7
             Avg. Size (Hectares)      8.0      10.6       6.5       6.0

Lee          Acres                    1498       813        36       620
             Hectares                  606       329        15       251
             No. Fields                 42        31         2        34
             Avg. Size (Acres)        35.6      26.2      18.0      18.2
             Avg. Size (Hectares)     14.4      10.6       7.4       7.4

D. Registration Accuracy

To alleviate the need to locate field and section coordinates
in every data set and to permit multitemporal data analysis, ERTS
data from all available passes over each segment were spatially
registered. For CITARS, the maximum allowable error in registration
was 0.5 pixel as measured by the root mean square of
checkpoint residuals. With the guard row and column of one
whole pixel between actual field boundaries and selected sample
pixels, any error in spatial registration should not affect
classification performance of field center pixels. Any
registration error, however, could affect the proportion estimates
obtained from classifications of entire sections. To determine
if there was any significant effect of registration on
classification performance, comparisons were made between registered
and non-registered data for five segment-date combinations.
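The registration criterion can be illustrated with a short sketch. It assumes that each checkpoint residual has a line component and a column component, both in pixels, and that the RMS is taken over the radial residuals; the report does not state how the components were combined, and the residual values below are hypothetical.

```python
import math

MAX_RMS_PIXELS = 0.5  # maximum allowable registration error for CITARS

def rms_residual(residuals):
    """RMS of checkpoint residuals given as (line_error, column_error)."""
    squared = [dl * dl + dc * dc for dl, dc in residuals]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical checkpoint residuals, in pixels.
residuals = [(0.3, -0.2), (-0.1, 0.4), (0.2, 0.1), (-0.3, -0.3)]
rms = rms_residual(residuals)
print(round(rms, 3), rms <= MAX_RMS_PIXELS)
```

An RMS error below 0.5 pixel is smaller than the one-pixel guard band around field centers, which is why field-center classification should be unaffected while whole-section proportion estimates may not be.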

E. Spectral Characteristics of Crops 

Accurate identification of crops by the methods used for 
CITARS requires that the crops and other cover types are sepa- 
rable based on their spectral characteristics. Classification 
performance, then, depends on the spectral separability of the 
cover types. An experiment was performed to evaluate the
spectral discriminability of the cover types involved.

F. Characteristics of ERTS Data

Since accurate identification of crops by the methods used
for CITARS requires that the cover types are separable based on
their spectral characteristics, classification performance
depends not only on the spectral separability of the cover types
but also on the ability of the scanner to measure spectral
differences. An experiment was performed with aircraft scanner data
having a greater number, width, and dynamic range of spectral bands
than the ERTS data to determine whether classification
performance would be increased.

III. Statistical Analysis of Results 
The statistical analyses used for the principal CITARS 
results were applied to the results of the additional investi- 
gations. Briefly, analysis of variance was used to determine 
if any differences in results were statistically significant 
and the Newman-Keuls Multiple Range Test was applied to deter- 
mine which treatments were different. 

For the analysis of test field classification performance
results, the non-diagonal elements of the classification
performance matrix were used. Since the elements of the estimated
performance matrix are distributed binomially, the variance of
the sum of the non-diagonal elements will be less dependent on
the mean if the individual elements of the performance matrix
are transformed [9]. A summation of transformed values was
used as the variable for analysis of variance. The value of
the variable was found by:

$$\sum_{i} \sum_{j \neq i} \arcsin\left(e_{ij}^{1/2}\right)$$

where $e_{ij}$ is an element of the classification performance
matrix. (Summation is from 1 to 3 for the three cover types.)
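The transformed variable can be sketched in a few lines, assuming the transformation is the arcsine of the square root of each off-diagonal element, which is the usual variance-stabilizing transformation for binomial proportions. The performance matrix below is hypothetical.

```python
import math

def transformed_error_sum(performance):
    """Sum of arcsin(sqrt(e_ij)) over the off-diagonal elements.

    performance[i][j] = proportion of class i pixels classified as class j.
    """
    return sum(
        math.asin(math.sqrt(e))
        for i, row in enumerate(performance)
        for j, e in enumerate(row)
        if i != j
    )

# Hypothetical 3 x 3 performance matrix; each row sums to 1.
performance = [
    [0.80, 0.10, 0.10],
    [0.20, 0.70, 0.10],
    [0.30, 0.20, 0.50],
]
print(round(transformed_error_sum(performance), 3))
```

A perfect classification (identity matrix) gives a value of zero, and the value grows as more pixels fall off the diagonal.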
To evaluate the proportion estimates for the sections the
classification results proportions were compared to the
proportions as determined by photointerpretation. The accuracy of the
proportion estimation is measured by

$$\sum_{i=1}^{k} \left(\hat{P}_i - P_i\right)^2$$

where $k$ is the number of classes, $\hat{P}_i$ is the computer-estimated
proportion of class $i$, and $P_i$ is the proportion of class $i$ as
determined by photointerpretation. In order to obtain more
homogeneous variances, the variable was transformed [9]. The
variable used for the analysis of variance was

$$\ln\left[100 \sum_{i=1}^{k} \left(\hat{P}_i - P_i\right)^2 + .02\right]$$

A detailed discussion of the statistical analysis of results
can be found in Volume IX of this report [4].
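As a worked illustration of the proportion estimation variable, the sketch below computes the logarithmic transform of the summed squared differences between estimated and photointerpreted proportions for one section. The proportions are hypothetical and the function name is illustrative.

```python
import math

def proportion_error_variable(estimated, photointerpreted):
    """ln[100 * sum((P_hat_i - P_i)^2) + 0.02] for one section."""
    sse = sum((e - p) ** 2 for e, p in zip(estimated, photointerpreted))
    return math.log(100.0 * sse + 0.02)

# Hypothetical proportions for corn, soybeans, and "other".
estimated = [0.40, 0.30, 0.30]
photointerpreted = [0.35, 0.25, 0.40]
print(round(proportion_error_variable(estimated, photointerpreted), 3))
```

The constant .02 keeps the logarithm finite when a section is estimated exactly; in that case the variable takes its minimum value, ln(.02), or about -3.91.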

IV. Investigation of Alternative Analysis Procedures

A. Introduction

To accomplish the objectives of the CITARS experiment, the 
ADP procedures used to obtain classification results had to be 
well-defined (capable of being automated) and repeatable. Pro- 
cedures meeting these criteria would not be biased by analyst 
subjectivity. While this approach has certain advantages, it 
has the disadvantage that the analyst(s) could not tailor the
procedure to the particular problem and data set. The objec- 
tive of this study was to determine if classification perfor- 
mance was adversely affected by the automated and repeatable 
data analysis procedure used for CITARS. 

To answer this question, several variations in the
procedure were applied to the same data set. Data for Lee County,
Illinois, collected August 5, 1973 (run 73120202), were used.
This particular data set was chosen because the original
classification accuracy (60 percent) indicated that there was potential
for improvement.




B. Description of Analysis Procedures

Seven variations of the analysis procedure were applied. 

They are described in the following paragraphs and are summarized 
in Table 11. 

Procedure 1. The initial procedure is the one which was
utilized for CITARS and consists of the following steps: Three
cover type classes were defined: corn, soybeans, and all "other"
ground covers. When the major cover type classes were
multimodal, clustering was used to divide the classes into subclasses.
The clustering algorithm used requires that the analyst specify
the number of clusters to be found. The following rules were
used to determine the number of clusters to request: for corn,
request 5, for soybeans 5, agricultural "other" 10, and
non-agricultural "other" 3 for each identifiable subclass. There
are two exceptions: determine the maximum number of clusters
to request for each major class by dividing the number of data
points available for clustering by 40; for the agricultural "other"
or the non-agricultural "other," the minimum number of clusters
is the number of identifiable subclasses, even if this minimum
is greater than the maximum found in the previous exception.

All four channels were used for clustering, and a statistics deck
was punched from each cluster analysis, to be merged later. Any
cluster group having fewer than 25 points total was deleted from
further consideration. After the classes were refined and the
statistics decks merged into one, the data was classified using a
Gaussian maximum-likelihood classification rule. Equal prior
probabilities for all subclasses were assumed.
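
The decision rule named above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the LARS program:
the two-channel means and covariances are invented for the example
(the CITARS statistics decks used four ERTS channels), and all
function names are hypothetical. With equal prior probabilities, a
pixel is assigned to the subclass with the largest Gaussian
log-likelihood, and the cover type of that subclass is reported.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def log_likelihood(x, mean, cov):
    """Gaussian log-density of x, without the constant shared by all classes."""
    low = cholesky(cov)
    d = [xi - mi for xi, mi in zip(x, mean)]
    # Forward substitution solves low * z = d; sum(z^2) is the squared
    # Mahalanobis distance of x from the class mean.
    z = []
    for i in range(len(d)):
        z.append((d[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(len(low)))
    return -0.5 * log_det - 0.5 * sum(v * v for v in z)

def classify(x, subclasses):
    """Cover type of the most likely subclass, assuming equal priors."""
    best = max(subclasses, key=lambda s: log_likelihood(x, s["mean"], s["cov"]))
    return best["cover_type"]

# Hypothetical two-channel subclass statistics (illustration only).
subclasses = [
    {"cover_type": "corn", "mean": [30.0, 50.0], "cov": [[4.0, 1.0], [1.0, 9.0]]},
    {"cover_type": "soybeans", "mean": [36.0, 62.0], "cov": [[5.0, 0.5], [0.5, 6.0]]},
    {"cover_type": "other", "mean": [45.0, 40.0], "cov": [[16.0, 2.0], [2.0, 25.0]]},
]
print(classify([31.0, 52.0], subclasses))
```

Unequal class weights would enter this rule as an added log-prior
term for each subclass; with equal priors that term is the same for
every subclass and can be dropped.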

The classification results were displayed in the form of maps and
tables. Performances were tabulated for training fields, test
fields, and test sections. Pilot and test fields were combined for
this investigation.

In the remainder of this investigation, the procedures for class
definition and refinement were varied. The same classification
algorithm was used throughout and results were always tabulated
for the same fields and sections.

Procedure 2. The second test was verification of the
repeatability of the analysis. Given the original training 
fields and the number of clusters to request, the analyst 
carried out the specified procedure. The results, as expected, 
did duplicate the results obtained the first time. The overall 
classification performance for test fields was 55.2 percent. 

Procedure 3. For the next procedure, the only variation from the
defined procedure was in the number of clusters requested. The
guideline for the maximum number of clusters to request is to
divide the number of data points for the class by 40. The quotients
were 3.3 for corn, 2.75 for soybeans, and 9.9 for "other."
Originally, three corn, two soybean, and nine "other" clusters were
requested. The same quotients could have been interpreted to
request three corn, three soybean, and 10 "other" clusters. When
these clusters were requested and the defined procedure was
followed, overall performance was 55.3 percent.
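
The cluster-count rule and the two readings of the quotients can be
sketched as below. The point counts are the original training point
counts reported for this data set (131 corn, 110 soybeans, 396
"other"). The function name, the `rounding` switch, and the default
request of 10 for "other" are assumptions made for the illustration;
any default of 10 or more gives the same answer here because the
cap of one cluster per 40 points is the binding limit.

```python
import math

POINTS_PER_CLUSTER = 40  # divisor stated in the procedure

def clusters_to_request(n_points, default_request, min_request=1, rounding="down"):
    """Number of clusters to request for one major class.

    The default request is capped at n_points / 40. For "other",
    min_request (the number of identifiable subclasses) overrides
    the cap when it is larger.
    """
    quotient = n_points / POINTS_PER_CLUSTER
    if rounding == "down":
        cap = math.floor(quotient)          # truncate the quotient
    else:
        cap = math.floor(quotient + 0.5)    # round to the nearest integer
    return max(min(default_request, cap), min_request)

points = {"corn": 131, "soybeans": 110, "other": 396}
defaults = {"corn": 5, "soybeans": 5, "other": 10}  # "other" default is assumed
for rounding in ("down", "nearest"):
    counts = {c: clusters_to_request(points[c], defaults[c], rounding=rounding)
              for c in points}
    print(rounding, counts)
```

Truncating the quotients reproduces the original request (three,
two, and nine clusters), while rounding to the nearest integer
reproduces the request used in Procedure 3 (three, three, and 10).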

Procedure 4. The next factor investigated was number of training
points. The number of training points originally provided was 131
corn, 110 soybeans, and 396 "other." The analyst went back to an
aerial photograph, an overlay defining fields, and field
identification information to select more training points. The
original criterion of using only points inside a buffer zone of one
line or column was relaxed. The total number of training points
used was 416 corn, 350 soybean, and 788 "other." The defined
procedure was followed for the classification using these points
for training. Overall performance was 56.4 percent.

Procedure 5. The next procedure varied from the defined
procedure in several ways. One half of the original corn train- 
ing fields, one half of the original corn pilot fields, one 
half of the original corn test fields were randomly selected for 
training; also, one half of the original soybean training, test, 
and pilot fields were similarly selected. All of the additional 
training points selected in the previous procedure were also in- 
cluded. For clustering, five corn clusters and five soybean 
clusters were requested as before, but the "other" was handled 
differently. 

For clustering the class of "other," the analyst first divided the
training points into the following categories: woods; urban,
freeway, and other bare; pasture, small grain, and woods-pasture;
and water. Each of these subclasses of "other" was clustered
separately. The number of clusters to request was determined by
dividing the number of data points by 40 (and rounding). Then the
statistics from these six clustering jobs were merged into a single
statistics deck.

The analyst next ran the SEPARABILITY processor, which calculates
the statistical distance known as transformed divergence for all
pairs of classes. The analyst then looked for class pairs having a
transformed divergence less than 1000 (the maximum possible value
is 2000). There were three such class pairs. The class pairs were
(1) corn-2/woods-1, (2) corn-5/woods-2, and (3) soybean-4/small
grain-2, where corn-2/woods-1 means subclass 2 of class corn and
subclass 1 of class woods. Since in each case the classes were from
two different cover types, one of the classes was deleted from each
pair. The criterion for deletion of subclasses was: delete the
subclass of the cover type having more subclasses. That is, corn
had five subclasses, and woods two, so for both corn-woods class
pairs, the corn class was deleted. Soybeans had five subclasses and
small grain two, so for that class pair soybeans was deleted. This
left three subclasses of corn, four soybean, two small grain, three
woods, three urban and bare, and one water class, and none of these
class pairs had a transformed divergence less than 1000. The area
was then classified following the original CITARS procedures.
Overall test field performance was 57.1 percent.
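
The deletion criterion can be sketched as below. The three
low-divergence pairs and the subclass counts come from the text;
the divergence values themselves are invented placeholders, the
function name is hypothetical, and the handling of ties and of
pairs within one cover type (skipped here) is an assumption, since
the text does not cover those cases.

```python
THRESHOLD = 1000  # transformed divergence below this marks a confusable pair

def prune_subclasses(subclass_counts, pair_divergences, threshold=THRESHOLD):
    """Delete one subclass from each low-divergence pair of different cover types.

    subclass_counts: {cover_type: number of subclasses before pruning}
    pair_divergences: {((type_a, idx_a), (type_b, idx_b)): transformed divergence}
    The deleted subclass belongs to the cover type with more subclasses
    (the second member of the pair is dropped on a tie, by assumption).
    """
    deleted = set()
    for (a, b), td in pair_divergences.items():
        if td >= threshold or a[0] == b[0]:
            continue  # separable pair, or same cover type (not handled here)
        loser = a if subclass_counts[a[0]] > subclass_counts[b[0]] else b
        deleted.add(loser)
    remaining = dict(subclass_counts)
    for cover_type, _ in deleted:
        remaining[cover_type] -= 1
    return deleted, remaining

counts = {"corn": 5, "soybeans": 5, "woods": 2, "small grain": 2}
divergences = {  # placeholder values; only the pairs are from the text
    (("corn", 2), ("woods", 1)): 850.0,
    (("corn", 5), ("woods", 2)): 920.0,
    (("soybeans", 4), ("small grain", 2)): 780.0,
    (("corn", 1), ("soybeans", 1)): 1650.0,
}
deleted, remaining = prune_subclasses(counts, divergences)
print(sorted(deleted))
print(remaining)
```

With these inputs the sketch removes corn-2, corn-5, and soybean-4,
leaving three corn and four soybean subclasses as reported.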

Procedure 6. The next procedure differed rather drastically from
the standard CITARS procedure. The quarter sections were used as
the basis for training. Due to computer core size limitations, not
all quarter sections could be clustered at once, so the quarter
sections were arbitrarily divided into three groups. Again the
problem of number of clusters to request had to be solved. The
problem was approached in the following way: for each group of
quarter sections, clustering was run several times with various
numbers of clusters requested, SEPARABILITY was run on the
statistics of those clusters, and the set of clusters having the
greatest pairwise minimum distance was chosen.
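
The selection step can be sketched as a max-min choice over the
candidate clustering runs. In this illustration each cluster is
reduced to a single hypothetical mean value and the absolute
difference of means stands in for the distance; the procedure
described above used transformed divergence computed by
SEPARABILITY. The function names and data are assumptions.

```python
from itertools import combinations

def min_pairwise(distance, clusters):
    """Smallest distance between any two clusters of one clustering run."""
    return min(distance(a, b) for a, b in combinations(clusters, 2))

def pick_clustering(runs, distance):
    """Requested cluster count whose run has the largest minimum distance.

    runs: {number_of_clusters_requested: list of cluster descriptors}
    """
    return max(runs, key=lambda n: min_pairwise(distance, runs[n]))

def separation(a, b):
    """Stand-in distance between two one-dimensional cluster means."""
    return abs(a - b)

# Hypothetical cluster means from three runs on one group of quarter sections.
runs = {
    3: [10.0, 12.0, 40.0],
    4: [10.0, 20.0, 30.0, 40.0],
    5: [10.0, 18.0, 20.0, 30.0, 40.0],
}
print(pick_clustering(runs, separation))
```

Maximizing the minimum pairwise distance favors the run in which
even the two most similar clusters are still well separated.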

For the first group of quarter sections, 16 clusters were
requested; for the second, 12 clusters; and for the third, 16
clusters. Statistics were calculated for each cluster and punched
on cards for further use.

The map output from CLUSTER was used in conjunction with
aerial photography, an overlay of field boundaries, and field 
identification information to identify the cover type associated 
with each cluster. The statistics from all the clusters were 
put into the SEPARABILITY processor, and again the transformed 
divergence measure was used as the criterion for pooling and 
deleting subclasses. The data was then classified in the normal 
way. Overall performance was 61.4 percent. 

Procedure 7. Procedure 6 had achieved the best overall
performance, and the best performance for the class corn, but
procedure 5 had the best performance for soybeans, and the best
training field performance for "other." For procedure 7, training
classes from the procedure in which they had performed best were
combined into a new training statistics deck. Again SEPARABILITY
was run and transformed divergence used as a basis for pooling or
deleting subclasses. Overall classification performance for this
procedure was 47.4 percent.

C. Results and Discussion

The classification results are summarized in Tables 12 and 13.
None of the five alternative analysis procedures resulted in any
significant improvement in classification performance as measured
by proportion estimates for sections. The sixth procedure, which
involved clustering the quarter-sections, gave improved performance
for corn and "other" test fields, but at the expense of soybean
performance. Further investigation of that result, however, shows
that too many pixels in the sections were classified as corn, too
few as soybeans, and too few as "other." The seventh procedure gave
improved performance for "other" but low performances for both corn
and soybeans.

The conclusions drawn from these results are that (1) the CITARS
procedures used by LARS produce repeatable results and (2) none of
the alternative procedures tried resulted in any improvement in
classification performance. While these results and conclusions are
based on a relatively limited sample, it is probably safe to
conclude that little if any of the generally low classification
performances obtained in CITARS can be attributed to the data
analysis procedures used. In the context of LACIE, which will
involve many analysts, these results indicate that it is possible
to use repeatable and relatively automatic analysis procedures
without sacrificing classification performance.

Table 11. Summary descriptions of analysis procedures.

Procedure   Description

    1.      Original analysis following defined procedure.

    2.      Verification of repeatability.

    3.      Defined procedure, requesting different number of
            clusters for soybeans and other.

    4.      Additional training points selected, then defined
            procedure followed.

    5.      Extended set of training points, classes of "other"
            separated before clustering, transformed divergence
            calculated for class pairs, one class of pair deleted
            for distances below threshold (1000).

    6.      Quarter sections clustered, cluster maps used to
            identify clusters; transformed divergence used as
            criterion for pooling or deleting subclasses.

    7.      Corn training from procedure 6 and soybeans and
            other training from procedure 5 used for training,
            transformed divergence criterion used for pooling or
            deleting subclasses.

Table 12. Summary of classification performances (% correct
          classification of test fields) for seven analysis
          procedures.

Procedure     Corn    Soybeans    Other    Overall

    1         57.1      53.6      55.4      55.2
    2         57.1      53.6      55.4      55.2
    3         55.8      53.1      60.8      55.3
    4         68.8      50.8      42.5      56.4
    5         47.9      63.6      60.2      57.1
    6         87.6      37.1      69.9      61.4
    7         37.2      42.6      88.2      47.4

Table 13. Average proportions of corn, soybeans, and "other"
          present in 20 test sections as determined from
          seven analyses.

Procedure                      Corn    Soybeans    Other

    1                          36.1      30.5      33.4
    2                          36.1      30.5      33.4
    3                          36.0      28.5      35.5
    4                          46.6      24.2      29.2
    5                          25.2      31.2      43.6
    6                          48.0      12.4      39.7
    7                          21.8      15.7      62.5

Photointerpreted Proportion    21.8      31.3      46.9

V. Comparison of Training Sets

A. Introduction 

One of the objectives of CITARS was an examination of the 
effect of varying the training set selection on classification 
performance. To meet this objective, two training sets, each 
containing 10 quarter-sections, were to have been available for 
comparison. However, as training fields were selected, it be- 
came obvious that 10 quarter-sections would not provide an ade- 
quate training sample, and the two sets were combined to provide 
the 20 quarter-section training set. 

In this experiment, two training sets were used to train the
classifier: the ten "pilot" sections and the ten "test" sections.
The classification performance for each of these training sets was
compared to the classification performance of the 20
quarter-section training set.

B. Procedures 

The ten data sets described in Table 14 were selected for this
experiment. They were first classified using the 10 "pilot"
sections as the basis for training the classifier, and then
classified again using the 10 "test" sections as the basis for
training. The analysis procedures were the same as for other
classifications of ERTS data performed by LARS (i.e., LARS/SP1 and
LARS/SP2). The classifications based on "pilot" sections were
compared to the regular CITARS classifications (based on "training"
quarter-sections) by examining the overall classification
performance of field center pixels from the 10 "test" sections.
Similarly, the classifications based on the "test" sections were
compared to the regular CITARS classifications by examining the
overall classification performance of field center pixels from the
10 "pilot" sections. The comparisons were made in this way to avoid
biasing classification performance by testing on samples which were
used in training the classifier. The variability of proportion
estimation accuracy was evaluated using analysis of variance.

Table 14. Summary of data analyzed to determine effect of varying
          training set on classification performance.

Segment-Period-Pass    Date            ERTS Scene ID

Huntington-III         July 15         1357-15590
Livingston-III         July 16         1358-16045
Fayette-III-2          July 17         1359-16105
Lee-III-2              July 18         1360-16155
Lee-IV                 August 5        1378-16153
White-V                August 21       1394-16042
Fayette-V              August 21       1394-16044
Shelby-VI              September 7     1411-15581
Huntington-VII         September 24    1428-15520
Shelby-VII             September 24    1428-15523

C. Results and Discussion 

Overall performances obtained from the CITARS classifications
based on the "training" quarter sections and overall performances
obtained from the classifications based on the ten "pilot" sections
are shown in Table 15. For seven of ten cases, the "pilot"
classifications had higher overall test performance (column 5) than
the CITARS classifications (column 3). In only four instances
(i.e., HU-III, LI-III, WH-V, and FA-V) could "pilot" overall test
performance (column 5) be considered reasonably high (greater than
75%). Two of these instances (HU-III and FA-V) had reasonably high
CITARS overall test performance (column 3).

Table 15 also shows the overall performances obtained from the
classifications based on the ten "test" sections. The "test"
overall test performance (column 7) was less than the CITARS
overall test performance (column 2) were above 75%.

Table 15.

                        Source of Training Data

Segment-    Training Fields          Pilot Fields       Test Fields
Period                               Classification     Classification
                                     Performance (%)    Performance (%)
                                     Pilot    Test      Test     Pilot
                                     Fields   Fields    Fields   Fields
Column:     (1)     (2)     (3)      (4)      (5)       (6)      (7)

HU-III      92.3    28.4    80.1     89.7     78.7      87.1     72.7
LI-III      78.1    58.8    60.6     81.4     76.1      75.2     71.7
FA-III-2    77.8    52.9    63.7     86.8     69.7      89.8     73.7
LE-III-2    80.2    53.2    61.7     58.8     64.3      79.7     54.8
LE-IV       75.5    62.4    49.9     71.0     57.0      75.9     59.2
WH-V        87.9    75.8    74.3     88.3     80.7      84.1     67.0
FA-V        90.5    79.7    79.5     84.4     86.3      90.5     85.2
SH-VI       77.1    48.0    51.8     76.4     49.2      76.9     58.0
HU-VII      81.2    40.9    68.2     87.1     66.8      78.2     60.5
SH-VII      73.5    52.9    43.8     64.7     51.6      71.6     61.3

Mean        81.4    55.3    63.4     78.9     68.0      80.9     66.4

The same random sampling scheme was used to choose the "pilot" and
the "test" sections. Thus both sets of sections should represent
the same population. However, comparisons between the second and
third columns of Table 15 suggest that this conclusion is not
always true. In 4 cases (HU-III, FA-III-2, LE-IV, HU-VII), the
entries in column 2 and column 3 of Table 15 show differences in
performance greater than 10%. In two additional cases (LE-III-2 and
SH-VII), the differences are greater than 8%. These differences
suggest that the "pilot" fields and the "test" fields were not
always representative samples of the same population.
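
The column comparison can be checked with a few lines of
arithmetic. The values below are columns 2 and 3 of Table 15 as
read from this copy of the report, with FA denoting the Fayette
segment; the function name is hypothetical, and a threshold of 8
percent is assumed for the two additional cases.

```python
# Columns 2 and 3 of Table 15 (percent correct), keyed by segment-period.
column_2 = {"HU-III": 28.4, "LI-III": 58.8, "FA-III-2": 52.9, "LE-III-2": 53.2,
            "LE-IV": 62.4, "WH-V": 75.8, "FA-V": 79.7, "SH-VI": 48.0,
            "HU-VII": 40.9, "SH-VII": 52.9}
column_3 = {"HU-III": 80.1, "LI-III": 60.6, "FA-III-2": 63.7, "LE-III-2": 61.7,
            "LE-IV": 49.9, "WH-V": 74.3, "FA-V": 79.5, "SH-VI": 51.8,
            "HU-VII": 68.2, "SH-VII": 43.8}

def segments_differing(col_a, col_b, low, high=float("inf")):
    """Segments whose absolute difference d satisfies low < d <= high."""
    return [s for s in col_a if low < abs(col_a[s] - col_b[s]) <= high]

print(segments_differing(column_2, column_3, 10))     # more than 10 points
print(segments_differing(column_2, column_3, 8, 10))  # between 8 and 10 points
```

The largest difference, for HU-III, exceeds 50 percentage points,
while WH-V and FA-V differ by less than 2.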

The "pilot" fields, and also the "test" fields, were 
obtained from ten sections. Since ten sections have twice 
the area of twenty quarter-sections, one could expect the 
"pilot" fields (or the "test" fields) to contain twice as 
many pixels as the "train" fields. However, this was not the 
case. 

Table 16 gives the number of data points in each training 
set of the ten data sets used in this investigation. In only
four cases, HU-III "pilot", LI-III "test", SH-VI "pilot", and
HU-VII "pilot", was the number of points more than twice the
number of points in the regular CITARS training set. Thus,
the effect of training set size cannot be fully evaluated.

It is interesting to examine these four cases (HU-III, LI-III,
SH-VI, and HU-VII) in Table 15 in light of the number of points in
each training set. For example, though the "pilot" training set of
HU-III was more than twice the size of the "train" training set,
the "pilot" overall test performance was 78.7%, 1.4% less than the
CITARS overall test performance of 80.1% (column 3). The "test"
training set of HU-III was less than 50 points bigger than the
"train" training set, but the "test" overall test performance was
72.7%, a gain of 44.3% over the CITARS overall test performance of
28.4% (column 2). These results suggest that the representativeness
and adequacy of the training set is not a function of the training
set size alone.

The proportion estimation accuracy was examined through 
analyses of variance. The "pilot" and the "train" training 
sets were not significantly different; however, the "test" 
and the "train" training sets were significantly different. 
Since both the "test" and the "pilot" training sets were chosen 
in the same way, the results of the analyses of variance 
suggest that the choice of training set can significantly 
affect proportion estimation accuracy. 

VI. Effect of Multitemporal Registration
on Classification Performance

A. Introduction

To enable classifications of multitemporal ERTS data and 
to alleviate having to locate section and field coordinates in 
each segment-date combination of data, the satellite passes 
over each segment were registered as part of the data preparation
phase [4, Volume 5, "ERTS-1 Data Preparation"]. This
experiment was performed to determine if registration had any 
effect on classification performance and if so, the magnitude 
of the effect. 

B. Procedures

The experiment consisted of a comparison of crop classifi- 
cation performances obtained with registered and non-registered 
forms of ERTS data. Both forms of the data were geometrically 
corrected. Five segment-date combinations of data were selected 
for analysis. The coordinates of sections and fields used for 
the registered data were the same as used in the regular CITARS 
data classifications. The coordinates from approximately the 
same fields were located in the non-registered data by manually 
overlaying the photo overlays onto the ERTS imagery. A one-to- 
one correspondence of fields in both data sets was not used be- 
cause to do so would have eliminated several fields which were 
needed for training. However, about 80 percent of the fields 
were common to both data sets. The same procedure for selecting 
pixels from fields, i.e. one "guard" pixel between field boundary 
and any selected pixel, was followed in both cases. 

The same classification procedures, i.e., LARS/SP1 and SP2, were
applied to both the registered and non-registered data sets for all
five segment-date combinations. Also, the non-registered data was
classified with statistics from the registered data, and the
registered data was classified with statistics from the
non-registered data. Test and pilot fields were combined into a
single test set, and test and pilot sections were combined.
Recognition performances for fields and proportion estimates for
sections were tabulated, and an analysis of variance was performed
to determine if any significant difference existed between the
registered and non-registered data.

C. Results 

Overall classification performances for test and pilot fields
combined are shown in Table 17 for the five segment-date
combinations. The results of the analysis of variance (a
conservative test) indicated that there was no significant
difference between the performance of registered and non-registered
data. However, inspection of overall classification performances
for test and pilot fields combined, summarized in Table 17, shows
that Fayette-III-1 and Huntington-III had differences in
performance of approximately 20% between registered and
non-registered results. Huntington and Fayette had the smallest
average field sizes, and it would be expected that the effect of
any registration errors would be magnified for small fields. From
this, it appears that average field size may be one factor
affecting classification performance in registered data sets.


Table 17. Overall classification performance of registered and
          non-registered forms of ERTS data.

                  Average             Without weights             With
                  field                                           weights
                  size                         Non-Reg.  Reg.
                  (acres)                      w/Reg.    w/Non-Reg.
Segment-date                Non-Reg.   Reg.    Stats     Stats

Fayette-II         16.8      42.4      53.1     49.5      48.1     50.2
Fayette-III-1      16.8      71.0      51.1     51.2      69.7     5?.6
Livingston-IV      30.7      70.1      67.3     68.2      68.4     66.3
White-V            34.1      76.2      75.1     76.7      74.5     76.?
Huntington-III     20.1      66.1      44.8     48.0      65.7     ?5.8

VII. Spectral Discriminability of Corn,
Soybeans, and "Other"

A. Introduction 

In Section V the effects on classification performance of training
set variation were discussed. In this section the potential
spectral discriminability of corn, soybeans, and "other" will be
examined in the context of the level of classification performance
which would be possible if the number of training points were not
limited (i.e., if all fields were used for training the
classifier). Using all fields for training the classifier should
provide an optimistic upper limit on classification performance and
an indication of the true spectral discriminability of the cover
types of interest under the CITARS conditions (i.e., ERTS data for
selected locations and times). By comparing these results to the
original classifications it should also be possible to determine if
classification accuracy was severely affected by the limitation of
available training data.

B. Procedures 

Ten data sets, described in Table 14, were selected for
classification using all training, test, and pilot fields for
training. The analysis procedure was the basic procedure used by
LARS for CITARS classifications of ERTS data (i.e., LARS/SP1).
Overall correct classification of field center pixels was used as
the measure of classification performance.

C. Results and Discussion

Classification results obtained with the original training sets
(fields from 20 quarter-sections) are compared in Table 18 with
results obtained using all fields for training. The classification
results for all fields show that in some instances (i.e., HU-III,
FA-V, WH-V, and HU-VII) reasonably high classification performance
(greater than 75%) would be possible if adequate training data were
available. In the remainder of data sets classified the low
performances indicate that the cover types of interest are not
spectrally separable in the ERTS bands.

Table 18. Comparison of overall classification performance for
          classifications based on training statistics from
          training fields versus all fields classified.

                        Source of Training Data

                 Training Fields            All Fields
Segment-         Classification Results     Classification Results
Period-Pass      Training    Test*          Training   Test*   All Fields

HU-III             92.3      44.8             83.1     82.9      82.9
LI-III             78.1      59.9             66.9     70.8      69.9
FA-III-2           77.8      59.3             72.9     74.0      73.6
LE-III-2           80.2      58.3             72.4     44.3      53.9
LE-IV              75.5      55.0             68.3     65.2      66.3
WH-V               87.9      75.1             78.9     77.1      77.7
FA-V               90.5      79.6             83.5     84.3      84.0
SH-VI              77.1      4?.8             71.5     65.9      67.1
HU-VII             81.2      49.6             72.6     78.6      77.3
SH-VII             73.5      48.5             48.5     48.4      48.4

*Test = test + pilot fields as defined for CITARS

Comparison of the results for the four best classifications to the
results of the original classifications of test + pilot fields
shows that WH-V and FA-V (75.1 and 79.6%, respectively) were
classified reasonably well with the original training fields, but
HU-III (44.8) and HU-VII (49.6) were not. This means that in at
least two cases the original training fields were not
representative of all fields in the segment and that performance
was adversely affected by inadequate or non-representative training
sets.
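
The reasoning in this comparison can be written out as a two-step
filter. The all-fields values are the test + pilot column of
Table 18 as read from this copy of the report, and the
original-training values are the four quoted in the text; FA
denotes the Fayette segment. The function names are hypothetical,
and the 75 percent threshold is the level the text treats as
reasonably high.

```python
THRESHOLD = 75.0  # percent correct regarded as "reasonably high"

# Test + pilot performance when all fields are used for training.
all_fields = {
    "HU-III": 82.9, "LI-III": 70.8, "FA-III-2": 74.0, "LE-III-2": 44.3,
    "LE-IV": 65.2, "WH-V": 77.1, "FA-V": 84.3, "SH-VI": 65.9,
    "HU-VII": 78.6, "SH-VII": 48.4,
}
# Test + pilot performance with the original training fields
# (only the four segments quoted in the text).
original = {"HU-III": 44.8, "WH-V": 75.1, "FA-V": 79.6, "HU-VII": 49.6}

def separable(all_fields, threshold=THRESHOLD):
    """Segments where the cover types appear spectrally separable."""
    return [s for s, pct in all_fields.items() if pct > threshold]

def limited_by_training(all_fields, original, threshold=THRESHOLD):
    """Separable segments whose original training set gave low performance."""
    return [s for s in separable(all_fields, threshold)
            if original[s] <= threshold]

print(separable(all_fields))
print(limited_by_training(all_fields, original))
```

The second list holds the segments where the shortfall is
attributable to the training set rather than to a lack of spectral
separability.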

The results indicate that there were two different situations
present: (1) For the available spectral bands, the spectral
characteristics of the cover types of interest were potentially
different enough to enable "good" classifications to be made; and
(2) the cover types were sufficiently similar that accurate
classifications could not be obtained by methods currently
available which rely only on the spectral information content of
ERTS multispectral scanner data. In the former case the level of
classification accuracy actually achieved depends on the quantity
and quality of training data; whereas, in the latter case
performance is low (< 75 percent overall correct classification of
test pixels) regardless of the amount and kind of training data
available. Of course, recognition might be improved in both cases
by the use of temporal and/or spatial information.

These conclusions are necessarily limited to the ERTS data, 
cover types, locations, and times considered in the CITARS experi- 
ment. In particular, it should be noted that the conclusions 
about the spectral separability of the cover types are based on 
the measurements made by the ERTS multispectral scanner. Evidence 
exists indicating that if the ERTS data had more spectral bands 
and/or greater dynamic range the separability of the cover types 
would be increased [10]. This question was investigated by
analyzing aircraft multispectral scanner data having more spectral
bands and greater dynamic range for one of the CITARS segments.
Results of that investigation are presented in the following
section of this report.

VIII. Analysis of Aircraft Multispectral Scanner Data 
A. Introduction 

One of the original objectives of CITARS was to compare
classification performances of ERTS-1 MSS data to aircraft-acquired
MSS data. Aircraft scanner data was acquired by the Bendix M²S
system for six missions and by the ERIM M-7 system for two
missions. Subsequent resource and time constraints limited the
analysis primarily to the ERTS data. The comparison, however, is
still an important one to be made, particularly in light of the
unexpected low performances obtained for the ERTS data
classifications. With this background, one of the flightlines of
M-7 scanner data over the Fayette Co., Illinois segment was
analyzed by LARS.

B. Procedures

Both the ERTS and aircraft scanner data were collected over the
Fayette Co. segment on August 21, 1973. The Fayette data was
selected primarily because of its availability for analysis (no
Bendix M²S data was available to LARS and only the data for the
ERIM M-7 mission over Fayette Co. on August 21 had been digitized
at the time of this analysis). The M-7 scanner data analyzed was
collected over the western two-thirds of the segment (two passes
were required to cover the entire segment) from an altitude of
approximately 4,650 meters at 8:30 a.m. local time. The low solar
elevation at the time of data collection caused severe sun angle
effects readily apparent in the data. Therefore, a preprocessing
algorithm for mean angle response correction was applied to the
data before analysis. Also, because the flight was flown so early
in the morning, the utility of the thermal channel for providing
crop discriminability information was probably limited. The
aircraft scanner data had 12 wavelength bands and an instantaneous
field of view of approximately 12 meters compared to 80 meters for
ERTS data. The 12 wavelength bands are shown in Table 19.

Table 19. Wavelength bands of the M-7 scanner.

Channel    Wavelength Band    Spectral Region
           (micrometers)

   1         .41-.48          visible
   2         .48-.52          visible
   3         .50-.54          visible
   4         .52-.57          visible
   5         .55-.60          visible
   6         .58-.64          visible
   7         .62-.70          visible
   8         .67-.94          near infrared
   9         .71-.73          near infrared
  10        1.00-1.40         near infrared
  11        2.00-2.60         middle infrared
  12        9.30-11.70        thermal infrared

Sixteen of the 20 quarter-sections and 19 of the 20 sections in
the segment were contained in the aircraft data. Coordinates were
obtained for a majority of fields present in the quarter-sections
and sections, taking care to ensure that only "pure" field center
pixels were sampled. Training statistics were developed in the same
manner as for the ERTS data analyses (i.e., LARS/SP1 and LARS/SP2
were used). The only exception was that four of the 12 available
channels were chosen for classification based on the maximum
average pairwise transformed divergence of the classes. The four
channels with the greatest average pairwise divergence were
.58-.64, .71-.73, 1.00-1.40, and 2.00-2.60 µm. The number of
subclasses of corn, soybeans, ag "other" and non-ag "other" was
two, two, five, and four, respectively, for the aircraft data. The
number of subclasses of corn, soybeans, and "other" was two, four,
and four, respectively, in the ERTS data. The classifications were
performed with and without class weights and classification
performance tabulated for training, test, and pilot fields.
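
Channel selection by maximum average pairwise transformed
divergence can be sketched as an exhaustive search over channel
subsets. The sketch makes several assumptions that are not from the
report: channels are treated as uncorrelated, so the divergence D
is a sum of per-channel terms; transformed divergence is taken as
2000[1 - exp(-D/8)], a common definition that agrees with the
stated maximum of 2000; and the class statistics are invented.
Channel indices in the code are zero-based.

```python
import math
from itertools import combinations

def transformed_divergence(stats_a, stats_b, channels):
    """Transformed divergence between two classes on the given channels.

    Each stats_* is a list of (mean, variance) pairs, one per channel.
    """
    d = 0.0
    for c in channels:
        (ma, va), (mb, vb) = stats_a[c], stats_b[c]
        d += 0.5 * (va - vb) * (1.0 / vb - 1.0 / va)         # variance term
        d += 0.5 * (1.0 / va + 1.0 / vb) * (ma - mb) ** 2    # mean term
    return 2000.0 * (1.0 - math.exp(-d / 8.0))

def best_channels(class_stats, n_select):
    """Channel subset with the largest average pairwise transformed divergence."""
    n_channels = len(next(iter(class_stats.values())))
    pairs = list(combinations(class_stats, 2))

    def average(channels):
        total = sum(transformed_divergence(class_stats[a], class_stats[b], channels)
                    for a, b in pairs)
        return total / len(pairs)

    return max(combinations(range(n_channels), n_select), key=average)

# Hypothetical (mean, variance) statistics for three channels.
class_stats = {
    "corn":     [(20.0, 4.0), (30.0, 9.0), (60.0, 16.0)],
    "soybeans": [(20.0, 4.0), (40.0, 9.0), (45.0, 16.0)],
    "other":    [(20.0, 4.0), (50.0, 9.0), (30.0, 16.0)],
}
print(best_channels(class_stats, 2))
```

Because the first channel is identical for all three classes in
this example, it contributes no divergence and is left out of the
selected pair.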

C. Results and Discussion

Classification performance for field center pixels (test fields)
for the ERTS and aircraft data are shown in Table 20. Although
there were substantial differences for individual classes between
the ERTS and aircraft data classifications, overall performance for
the two data sets was nearly identical; performance for with
weights and without weights classifications averaged 78 percent for
ERTS vs. 77 percent for aircraft. Use of class weights did not
significantly affect performance for either the ERTS or aircraft
data classifications.

Another topic of interest is the wavelength bands indicated 
by the feature selection algorithm as best for discriminating 
among the training classes for the aircraft data. Table 21 shows 
the best five combinations of four, five and six channels. Every 
channel combination in the table includes at least one visible and 
two near infrared bands. In the combination of four channels, the 
remaining band was middle infrared, four out of five times. For 

Table 20. Classification performance (percent correct) for field 
center pixels of ERTS-1 MSS data and aircraft MSS data, 
Fayette Co., Illinois, August 21, 1973. 

                   Training Fields          Test Fields* 
Class            W/ Wts.    W/O Wts.     W/ Wts.    W/O Wts. 

ERTS-1 MSS data 
  Corn             77.1       80.0         79.0       76.2 
  Soybeans         89.6       89.1         95.0       94.4 
  "Other"          96.4       96.9         65.2       61.5 
  Overall          90.5       91.0         79.6       77.2 

Aircraft MSS data 
  Corn             83.7       86.6         69.1       71.3 
  Soybeans         84.9       85.9         76.0       76.0 
  "Other"          91.6       91.3         83.4       83.3 
  Overall          86.7       87.7         76.9       77.4 

*Test = test + pilot fields 






Table 21. Rank of channel combinations on the basis of average divergence. 

                     Minimum       Average 
Channels             Divergence    Divergence    Spectral Regions 

Best five combinations of four channels 
  2,9,10,11            1390          1939        V,NIR,NIR,MIR 
  7,9,10,11            1363          1932        V,NIR,NIR,MIR 
  5,9,10,11            1345          1931        V,NIR,NIR,MIR 
  6,8,9,10             1132          1930        V,NIR,NIR,NIR 
  2,9,10,11            1278          1925        V,NIR,NIR,MIR 

Best five combinations of five channels 
  6,8,9,10,11          1457          1963        V,NIR,NIR,NIR,MIR 
  7,8,9,10,11          1456          1960        V,NIR,NIR,NIR,MIR 
  5,8,9,10,11          1450          1958        V,NIR,NIR,NIR,MIR 
  2,8,9,10,11          1468          1956        V,NIR,NIR,NIR,MIR 
  3,8,9,10,11          1417          1954        V,NIR,NIR,NIR,MIR 

Best five combinations of six channels 
  6,8,9,10,11,12       1499          1969        V,NIR,NIR,NIR,MIR,FIR 
  2,6,8,9,10,11        1493          1968        V,V,NIR,NIR,NIR,MIR 
  11,6,8,9,10,11       1498          1968        V,V,NIR,NIR,NIR,MIR 
  1,6,8,9,10,11        1508          1968        V,V,NIR,NIR,NIR,MIR 
  4,7,8,9,10,11        1491          1967        V,V,NIR,NIR,NIR,MIR 

the combinations of five channels, the five best combinations 
all included the available reflective infrared (three near and 
one middle), and the fifth channel was a visible band. The best 
five combinations of six channels also included the four 
reflective infrared bands and a visible band. The remaining band was 
another visible four out of five times. Caution should be 
exercised in drawing any conclusions about the utility of the far 
infrared (emissive infrared, or thermal) because the data were 
collected at 8:30 a.m. 

This comparison of ERTS and aircraft data classification 
performance for one segment and time indicates that there was 
little if any difference between the two. However, this conclusion 
was based on analysis of only one segment and time. Further, the 
ERTS data classification had the highest classification accuracy 
of all the CITARS classifications, and the aircraft scanner data 
were collected under suboptimal conditions with a very low sun 
angle. In spite of attempts to "correct" or compensate for the 
sun angle problem, this is likely (because of its severity) to 
have had an adverse effect on classification performance. The 
combination of these two effects may have brought the ERTS and 
aircraft data classifications closer together than they might be 
under other conditions. The classification performances obtained 
in this experiment with aircraft data do not approach those 
obtained in previous classifications of aircraft data (i.e., 1971 
CBWE). To better determine the level of classification accuracy 
which could be anticipated from aircraft data in the CITARS 
context, performance of additional analyses is recommended. 


Part 3. Summary and Conclusions 


The classification results obtained by LARS were presented 
in Parts 1 and 2 of this report. Part 1 contains the "regular" 
CITARS classification results and Part 2 describes the results 
of several additional investigations. Since the results of 
the statistical analyses are presented in Volume IX and 
discussed in Volume X of the final report along with results 
from EOD and ERIM, only the results specific to LARS have been 
discussed in this report. 

One of the important results of CITARS at LARS has been 
the definition, implementation, and evaluation of an automatable 
and repeatable data analysis procedure. The newly defined 
procedure was first used for CITARS, but it performed very 
well relative to other procedures both in terms of data 
analysis efficiency and classification performance. The 
efficiency of the procedure is indicated by the fact that 
the 15 local and 20 non-local classifications using both the 
SP1 and SP2 procedures were all completed by two part-time 
analysts in three months. The procedure was also shown to 
yield nearly identical results when used by several analysts 
on the same data sets. Subsequent tests showed that the 
performances obtained using the procedure were similar to 
those obtained using analyst-dependent procedures. 

Statistical comparisons of the two LARS procedures, SP1 
and SP2, showed no significant difference between them as 
measured by either classification accuracy or proportion 
estimation. The procedure identified as SP1 used equal 
prior probabilities, while SP2 used unequal prior probabilities 
based on 1972 county acreage estimates by the Statistical 
Reporting Service of the U.S. Department of Agriculture. 

There are three possible reasons why unequal prior 
probabilities did not produce significantly better results than equal 
prior probabilities: (1) the weights came from 1972, while the 
data were from 1973, and the true proportions could have 
changed from one year to the next; (2) the weights pertain to 
counties but were applied to segments, which are fractions of 
counties and might therefore have different true proportions; 
(3) the analysis of variance was performed on results for 
sections, and sections vary within segments. 
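
The role of prior probabilities (class weights) in a Gaussian maximum likelihood decision rule can be illustrated with a minimal sketch. It assumes diagonal class covariances for brevity. The class statistics and the 0.8/0.2 weights below are hypothetical values chosen only to show the effect, and the function names are not taken from the LARS software.

```python
import math


def log_discriminant(pixel, means, variances, prior):
    """Log of (prior x Gaussian likelihood), dropping the constant 2*pi term.

    Assumption: channels are treated as independent (diagonal covariance).
    """
    g = math.log(prior)
    for x, m, v in zip(pixel, means, variances):
        g -= 0.5 * math.log(v) + 0.5 * (x - m) ** 2 / v
    return g


def classify(pixel, class_stats, priors=None):
    """Assign the pixel to the class with the largest discriminant value.

    class_stats: dict mapping class name -> (means, variances).
    priors: dict mapping class name -> weight; equal weights if omitted.
    """
    names = list(class_stats)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    return max(
        names,
        key=lambda n: log_discriminant(pixel, *class_stats[n], priors[n]),
    )


# Hypothetical single-channel statistics: (means, variances) per class.
stats = {"corn": ([30.0], [16.0]), "soybeans": ([34.0], [16.0])}
pixel = [32.5]

equal = classify(pixel, stats)
weighted = classify(pixel, stats, {"corn": 0.8, "soybeans": 0.2})
```

With equal weights the pixel goes to the spectrally nearer class (soybeans); with the heavier corn weight the same pixel is assigned to corn. Weights only change the decision for pixels near a class boundary, so weights that do not match the true proportions of the area being classified can be expected to help little.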

Classification performances for CITARS were generally 
lower than originally anticipated. For this reason, several 
experiments were performed to investigate the effect of various 
factors, and the results were presented in Part 2 of this 
report. Six factors which may have affected the performance 
were identified and investigated: (1) method of evaluation 
used, (2) data analysis and classification procedures used, 
(3) availability of training data, (4) registration accuracy, 
(5) spectral characteristics of the scene, and (6) 
characteristics of the ERTS data. 

Evaluation of the classifications was based on crop 
identifications determined by photointerpretation. These 
identifications must be accurate if performance evaluations 
are to be reliable. Tests of photointerpretation accuracy 
indicated that the crops in 95-98 percent of the fields were 
correctly identified (5). It was therefore concluded that 
photointerpretation errors did not substantially influence 
classification performance. 

To investigate the effects of the data analysis procedures 
used, an experiment was conducted using several alternative 
procedures. The alternative procedures did not result in 
improved classification performances, indicating that the 
generally low classification performances obtained in CITARS 
cannot be attributed to the data analysis procedures used. 

Another experiment was conducted to determine the effects 
of training set size and selection. Results showed that significant 
differences in classification performance can be obtained 
with different training sets, and that training set size alone 
does not determine the representativeness of a training set. 

Comparisons of classification performance for registered 
and non-registered data showed that there was no significant 
difference between the two forms of ERTS data. 

Classification performance depends largely on the degree 
of spectral separability of the cover types of interest. An 
investigation of the data characteristics showed that there 
were some cases in which the cover types of interest were 
spectrally different enough to enable discrimination among them 
(provided adequate training data was available). However, in 
other instances the cover types were so spectrally similar 
(as measured by the ERTS system) that they could not be 
discriminated regardless of the amount of training data used. 

Since accurate identification of crops requires spectral 
separability, classification performance depends not only on 
the spectral characteristics of the cover types but also on 
the ability of the scanner to detect and measure spectral 
differences. To study the effect of the ERTS scanner on 
classification performance, a data set collected by an airborne 
multispectral scanner system having more wavelength bands over 
a wider region of the spectrum and greater sensitivity and 
dynamic range was analyzed for comparison. Although 
there were substantial differences in performance for individual 
classes between the ERTS and aircraft data analyses, overall 
performance for the two data sets was nearly identical. 